fix(c++,pt-expt): substitute default_fparam in .pt2 compute #5343
njzjz merged 3 commits into deepmodeling:master
Conversation
DeepPotPTExpt::compute() crashed when fparam was empty on a model with has_default_fparam=true, because the .pt2 model expects a concrete [1, dim_fparam] tensor. This change stores the default_fparam values in the .pt2 metadata and substitutes them in C++ when the caller passes an empty fparam vector.
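The substitution rule can be sketched as follows — Python for brevity, since the actual change lives in C++ inside DeepPotPTExpt::compute(); the helper name resolve_fparam and its exact error messages are hypothetical:

```python
def resolve_fparam(fparam, default_fparam, dim_fparam):
    """Hypothetical helper mirroring the C++ logic in this PR: if the
    caller supplies fparam values, use them; if fparam is empty and the
    model carries defaults, substitute those; otherwise fail clearly
    instead of letting torch raise a cryptic shape mismatch."""
    if len(fparam) > 0:
        if len(fparam) % dim_fparam != 0:
            raise ValueError("fparam size is not a multiple of dim_fparam")
        return list(fparam)          # caller-provided values always win
    if default_fparam is not None:
        return list(default_fparam)  # defaults read from .pt2 metadata at init
    raise ValueError("fparam is empty and the model provides no default_fparam")

print(resolve_fparam([], [0.25852028], 1))  # → [0.25852028]
```

The key design point is that the fallback only triggers on an empty caller-side vector, so existing callers that pass fparam explicitly are unaffected.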
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 263de1fdc2
No actionable comments were generated in the recent review. 🎉
📝 Walkthrough
Exports a model-level …
Sequence Diagram

    sequenceDiagram
        participant Export as Exporter
        participant Serializer as Serializer
        participant Model as SavedModel(.pt2/.pte)
        participant CPPInit as C++Backend:init
        participant CPPCompute as C++Backend:compute
        participant User as UserCode
        Export->>Serializer: call model.get_default_fparam()
        Serializer->>Model: write metadata["default_fparam"]
        User->>CPPInit: load model
        CPPInit->>Model: read metadata
        CPPInit->>CPPInit: set has_default_fparam_ / default_fparam_
        User->>CPPCompute: compute(coords, aparam, fparam=empty)
        alt caller-provided fparam non-empty
            CPPCompute->>CPPCompute: use provided fparam tensor
        else default_fparam_ exists
            CPPCompute->>CPPCompute: construct fparam_tensor from default_fparam_ (torch::from_blob(...).clone())
        else
            CPPCompute->>CPPCompute: use empty fparam tensor / error if expected
        end
        CPPCompute->>User: return energies / forces / virials
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@deepmd/pt_expt/utils/serialization.py`:
- Around line 197-198: The metadata currently stores model.get_default_fparam()
(a torch.Tensor for the PT backend) which is not JSON-serializable; change the
code that builds metadata (the dict keys "has_default_fparam" /
"default_fparam") to convert the tensor to a plain Python list (e.g., call
.tolist() or otherwise map it to serializable types) when
model.has_default_fparam() is True so that subsequent json.dumps(...) calls
(used in the .pte/.pt2 export paths) succeed; update usage around the
model.get_default_fparam() call to store a list instead of a tensor.
In `@source/api_cc/src/DeepPotPTExpt.cc`:
- Around line 639-650: The code currently trusts metadata["default_fparam"]
contents; instead validate in init() that metadata contains a JSON array, that
each element is numeric (not null/string/object) and that its length matches
dfparam (or dfparam_.size()) before populating default_fparam_. If any check
fails, throw a deepmd::deepmd_exception with a clear message; only after
validation iterate over metadata["default_fparam"].as_array() and push_back
as_double() into default_fparam_. Use the existing symbols has_default_fparam_,
metadata, default_fparam_, dfparam (or dfparam_), init(), and compute() to
locate and implement these checks.
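The validation the reviewer asks for can be sketched as follows — in Python for brevity, since the real implementation is C++ in DeepPotPTExpt::init() using deepmd::deepmd_exception; the function name parse_default_fparam and the error messages are hypothetical:

```python
import json


def parse_default_fparam(metadata_json, dfparam):
    """Validate metadata["default_fparam"] before trusting it: it must be
    a JSON array, its length must match dim_fparam, and every element must
    be numeric (not null/bool/string/object)."""
    metadata = json.loads(metadata_json)
    raw = metadata.get("default_fparam")
    if not isinstance(raw, list):
        raise ValueError("default_fparam metadata must be a JSON array")
    if len(raw) != dfparam:
        raise ValueError(
            f"default_fparam length {len(raw)} does not match dim_fparam {dfparam}"
        )
    if not all(
        isinstance(x, (int, float)) and not isinstance(x, bool) for x in raw
    ):
        raise ValueError("default_fparam entries must be numeric")
    return [float(x) for x in raw]


print(parse_default_fparam('{"default_fparam": [0.25852028]}', 1))
# → [0.25852028]
```

Doing all of this at init time, as the comment suggests, means a malformed model fails fast with a clear message rather than producing a confusing error deep inside compute().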
In `@source/api_cc/tests/test_deeppot_default_fparam_ptexpt.cc`:
- Around line 82-86: The test should skip when the pt2 experiment support itself
is unavailable, not just when BUILD_PYTORCH is off: in SetUp() keep the existing
BUILD_PYTORCH guard and then detect pt2 support before calling dp.init(...pt2);
if DeepPotPTExpt exposes a runtime query (e.g., dp.supports_pt2() or
dp.has_pt2_support()) call that and GTEST_SKIP() when false; if support is only
exposed as a compile-time macro in DeepPotPTExpt.h, check that macro (e.g.,
`#ifndef` DEEP_POT_PT2_ENABLED) and skip similarly—ensure the
dp.init("../../tests/infer/fparam_aparam_default.pt2") call is only reached when
pt2 support is present.
In `@source/tests/infer/gen_fparam_aparam.py`:
- Around line 207-214: The unused local binding v_d from the call to
dp_default.eval (e_d, f_d, v_d, ae_d, av_d = dp_default.eval(...)) triggers Ruff
RUF059; rename it to a throwaway name such as _v_d or _ to indicate it is
intentionally unused (e.g., change v_d to _v_d in the assignment) so the linter
stops flagging it while leaving the rest of the unpacking and the
dp_default.eval call (coord, box, atype, fparam=fparam_val, aparam=aparam_val,
atomic=True) unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: dc8e997f-8806-44eb-bd0f-1b12af631a88
📒 Files selected for processing (6)
- deepmd/pt_expt/utils/serialization.py
- source/api_cc/include/DeepPotPTExpt.h
- source/api_cc/src/DeepPotPTExpt.cc
- source/api_cc/tests/test_deeppot_a_fparam_aparam_ptexpt.cc
- source/api_cc/tests/test_deeppot_default_fparam_ptexpt.cc
- source/tests/infer/gen_fparam_aparam.py
💤 Files with no reviewable changes (1)
- source/api_cc/tests/test_deeppot_a_fparam_aparam_ptexpt.cc
Codecov Report
❌ Patch coverage is …
Additional details and impacted files
@@ Coverage Diff @@
## master #5343 +/- ##
==========================================
- Coverage 82.28% 82.27% -0.01%
==========================================
Files 797 798 +1
Lines 82100 82214 +114
Branches 4003 4020 +17
==========================================
+ Hits 67557 67644 +87
- Misses 13336 13355 +19
- Partials 1207 1215 +8
☔ View full report in Codecov by Sentry.
The .pth default_fparam model is committed to git with its own expected values in test_deeppot_default_fparam_pt.cc. Regenerating it from gen_fparam_aparam.py produces different weights, breaking those tests.
♻️ Duplicate comments (1)
source/tests/infer/gen_fparam_aparam.py (1)
Lines 202-209: ⚠️ Potential issue | 🟡 Minor — Rename unused unpacked `v_d` to satisfy Ruff (RUF059).
At line 202, `v_d` is unpacked but never used, which will fail lint checks.

💡 Suggested fix

    - e_d, f_d, v_d, ae_d, av_d = dp_default.eval(
    + e_d, f_d, _v_d, ae_d, av_d = dp_default.eval(

As per coding guidelines, `**/*.py`: always run `ruff check .` and `ruff format .` before committing changes or CI will fail.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@source/tests/infer/gen_fparam_aparam.py` around lines 202 - 209, The tuple returned by dp_default.eval is unpacked into e_d, f_d, v_d, ae_d, av_d but v_d is unused and triggers Ruff RUF059; update the unpack to ignore that value (e.g., replace v_d with _ or _v_d) where dp_default.eval(...) is called so only used symbols e_d, f_d, ae_d, av_d remain, then run ruff check/format to verify.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@source/tests/infer/gen_fparam_aparam.py`:
- Around line 202-209: The tuple returned by dp_default.eval is unpacked into
e_d, f_d, v_d, ae_d, av_d but v_d is unused and triggers Ruff RUF059; update the
unpack to ignore that value (e.g., replace v_d with _ or _v_d) where
dp_default.eval(...) is called so only used symbols e_d, f_d, ae_d, av_d remain,
then run ruff check/format to verify.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: a6222669-d76f-4e13-b25f-eb13f13a5fba
📒 Files selected for processing (1)
source/tests/infer/gen_fparam_aparam.py
- Warn at init for older .pt2 models missing default_fparam metadata; explicit fparam callers still work
- Throw a clear error at compute time if fparam is empty and default values are unavailable (instead of a cryptic torch shape mismatch)
- Validate default_fparam_ length matches dim_fparam at init
- Rename unused v_d to _v_d in gen script
TestInferDeepPotDefaultFParamPtExpt was defined in both test_deeppot_a_fparam_aparam_ptexpt.cc and test_deeppot_default_fparam_ptexpt.cc (upstream deepmodeling#5343). Duplicate class names in the same gtest binary cause undefined behavior. Removed ours since upstream's version is strictly stronger: it checks against absolute expected values (which implies self-consistency). Kept NoDefaultFParamPtExpt (unique to our file).
Summary
- Store `default_fparam` values in `.pt2` metadata so C++ can read them at init time
- In `DeepPotPTExpt::compute()`, substitute the stored default values when the caller passes an empty fparam on a model with `has_default_fparam=true`
- Add the `fparam_aparam_default.pt2` test model

Test plan

- `runUnitTests_cc --gtest_filter="*DefaultFParam*PtExpt*"` — 10 tests pass (double + float, attrs/empty/explicit/lmp_nlist)
- `runUnitTests_cc --gtest_filter="*PtExpt*"` — all 100 PtExpt tests pass, no regressions
- Verified `fparam_aparam_default.pt2` metadata contains `default_fparam: [0.25852028]`
- Ran `gen_fparam_aparam.py` before the C++ tests